class: center, middle, inverse, title-slide .title[ # ISA 444: Business Forecasting ] .subtitle[ ## 08: Forecasting Environment ] .author[ ###
Fadel M. Megahed, PhD
Professor
Farmer School of Business
Miami University
@FadelMegahed
fmegahed
fmegahed@miamioh.edu
Automated Scheduler for Office Hours
] .date[ ### Spring 2025 ] --- ## Quick Refresher of Last Class ✅ Explain the differences between wide vs. long format ✅ Use [seaborn](https://seaborn.pydata.org/generated/seaborn.relplot.html) to plot multiple time-series ✅ Convert a data set to Nixtla's long format (`unique_id`, `ds`, `y`) ✅ Use [UtilsForecast](https://nixtlaverse.nixtla.io/utilsforecast/index.html) to visualize multiple series --- ## Learning Objectives for Today's Class - Install and import Nixtla's libraries ([StatsForecast](https://nixtlaverse.nixtla.io/statsforecast/index.html), [MLForecast](https://nixtlaverse.nixtla.io/mlforecast/index.html), [NeuralForecast](https://nixtlaverse.nixtla.io/neuralforecast/docs/getting-started/introduction.html), [UtilsForecast](https://nixtlaverse.nixtla.io/utilsforecast/index.html), and [TimeGPT](https://nixtlaverse.nixtla.io/nixtla/docs/getting-started/introduction.html)) for forecasting - Distinguish fixed window from rolling-origin - Introduce forecast accuracy metrics (MAE, MAPE, RMSE) --- class: inverse, center, middle # The Nixtlaverse Open-Source Forecasting Libraries --- ## Nixtla's Forecasting Libraries <img src="data:image/png;base64,#../../figures/nixtlaverse.png" width="87%" style="display: block; margin: auto;" /> --- ## Nixtla's Forecasting Libraries Nixtla provides **several open-source Python libraries** (and a closed source **TimeGPT** tool accessible via API calls) for **scalable forecasting tasks**. These libraries are **relatively** easy to use and can be integrated into your future forecasting workflows: - **[StatsForecast](https://nixtlaverse.nixtla.io/statsforecast/index.html)** – Fast & scalable statistical models (`ARIMA`, `ETS`, etc.). - **[MLForecast](https://nixtlaverse.nixtla.io/mlforecast/index.html)** – Machine learning-based forecasting (e.g., `XGBoost`, `LightGBM`). 
- **[NeuralForecast](https://nixtlaverse.nixtla.io/neuralforecast/docs/getting-started/introduction.html)** – Deep learning models for time series (e.g., `NBEATS`, `NHITS`, and `TFT`). - **[UtilsForecast](https://nixtlaverse.nixtla.io/utilsforecast/index.html)** – Utility functions for plotting, evaluation, etc. - **[TimeGPT](https://nixtlaverse.nixtla.io/nixtla/docs/getting-started/introduction.html)** – An AI transformer-powered forecasting API that requires minimal tuning. These libraries enable **forecasting at scale, which you will need in practice**. .footnote[ <html> <hr> </html> **Note:** These libraries can be installed via `pip` and are described in detail in [nixtlaverse.nixtla.io](https://nixtlaverse.nixtla.io/). Use the left-hand navigation bar to explore the documentation for each library. ] --- class: inverse, center, middle # Fixed Window vs. Rolling-Origin --- ## The Fixed Window Evaluation Approach - **Fixed Window** is the simplest approach to splitting your time series data into a training and a testing/holdout set. - The **goal** is to **train your model on the training set** and **evaluate its performance on the testing set (the last `k` observations in the data)**. + .black[.bold[Note that this is quite different from traditional machine learning applications for cross-sectional data.]] - The evaluation on the **testing/holdout** set can serve two purposes: + **Model Evaluation**: Assess the model's **performance on unseen data** (since it is not used during training, and hence acts as a proxy for the model's performance on future data). + **Model Selection**: Compare the **performance of different models** to select the **best one**. - Note that using this approach for model selection is **reasonable if your models do not involve hyperparameter tuning** (otherwise, you may overfit to the testing set).
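---

## Example: A Fixed Window Split in Python

The split above can be sketched in a few lines of pandas. This is a minimal illustration on a simulated monthly series in Nixtla's long format; the data and all variable names (`df`, `train_df`, `test_df`, `k`) are made up for the example.

``` python
import numpy as np
import pandas as pd

# Simulated monthly series in Nixtla's long format (unique_id, ds, y)
df = pd.DataFrame({
    'unique_id': 'series_1',
    'ds': pd.date_range('2020-01-01', periods=36, freq='MS'),
    'y': np.arange(36, dtype=float),
})

# Fixed window: hold out the last k observations for testing
k = 12
train_df = df.iloc[:-k]  # first 24 months -> used to fit the model
test_df = df.iloc[-k:]   # last 12 months -> used only for evaluation

print(len(train_df), len(test_df))  # 24 12
```

Note that the split preserves time order: every training timestamp precedes every testing timestamp, unlike a random train/test split for cross-sectional data.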
--- ## The Fixed Window Evaluation Approach <img src="data:image/png;base64,#08_forecasting_env_files/figure-html/fixed_window-1.png" width="100%" style="display: block; margin: auto;" /> --- ## The Rolling-Origin Evaluation Approach - **Rolling-Origin Evaluation** is a method for splitting time series data into training and testing sets where the testing sets move forward over time. - The **goal** is to **train the model on a subset of past observations** and evaluate its **performance on a future testing set at multiple time steps**. - The key difference from a fixed window approach is that the **testing set shifts forward, allowing for multiple evaluations**: + The **training set may expand (expanding window)** or **remain fixed (rolling window)**. + This ensures that model performance is assessed across **different points in time**. + It **reduces sensitivity to the initial split point** and provides a more **robust evaluation** of model performance over time. --- ## Expanding Window Evaluation (By 1 Month) <img src="data:image/png;base64,#../../figures/expanding_window_evaluation1mo.gif" width="100%" style="display: block; margin: auto;" /> --- ## Expanding Window Evaluation (By 12 Month) <img src="data:image/png;base64,#../../figures/expanding_window_evaluation12mo.gif" width="100%" style="display: block; margin: auto;" /> --- ## Rolling Non-Expanding Window Evaluation <img src="data:image/png;base64,#../../figures/nonexpanding_window_evaluation.gif" width="100%" style="display: block; margin: auto;" /> --- ## Cross Validation within the Nixtlaverse <img src="data:image/png;base64,#../../figures/ts_cross_validation_nixtla.png" width="63%" style="display: block; margin: auto;" /> .footnote[ <html> <hr> </html> **Note:** The `cross_validation` method can also be applied to other NixtlaForecast objects (e.g., `MLForecast`, `NeuralForecast`) to perform cross-validation for machine learning and deep learning models. 
See the [StatsForecast Cross Validation Tutorial](https://nixtlaverse.nixtla.io/statsforecast/docs/tutorials/crossvalidation.html) to access the page shown above. ] --- ## Activity: The `cross_validation` Method <div style='position: relative; padding-bottom: 56.25%; padding-top: 35px; height: 0; overflow: hidden;'><iframe sandbox='allow-scripts allow-same-origin allow-presentation' allowfullscreen='true' allowtransparency='true' frameborder='0' height='315' src='https://www.mentimeter.com/app/presentation/alph1gvzvcf291iyb9typ19e28i46ptv/embed' style='position: absolute; top: 0; left: 0; width: 100%; height: 100%;' width='420'></iframe></div> --- ## Recap of Fixed vs. Rolling-Origin - **Fixed Window**: + **Simplest approach** to splitting data into training and testing sets. + **Testing set is fixed** and **does not move forward** over time. + **Provides a single evaluation** of model performance. - **Rolling-Origin**: + **Testing set moves forward** over time. + **Training set may expand or remain fixed**. + **Provides multiple evaluations** of model performance. + **Reduces sensitivity to the initial split point**. + **More robust evaluation** of model performance over time. --- ## Recap of Fixed vs. Rolling-Origin - In practice, the **rolling-origin approach is preferred** for time series forecasting tasks since it mimics the real-world scenario of forecasting future data points. The choice of: + **Expanding vs. Rolling Window**, + **Window Size**, and + **Step Size** depends on the specific forecasting task and the data at hand. 
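---

## Example: Expanding Window Splits by Hand

Before calling `cross_validation`, it can help to see the splits it generates. The sketch below hand-rolls expanding-window (rolling-origin) folds on a simulated series; the parameter names mirror the `h`, `step_size`, and `n_windows` arguments used in the Nixtlaverse, but the data and helper code are illustrative only.

``` python
import numpy as np
import pandas as pd

# Simulated monthly series (values are illustrative)
y = pd.Series(np.arange(48, dtype=float),
              index=pd.date_range('2020-01-01', periods=48, freq='MS'))

h = 12          # forecast horizon per fold
step_size = 12  # how far the origin moves between folds
n_windows = 3   # number of evaluation folds

folds = []
for i in range(n_windows):
    # Cutoff (forecast origin) for fold i; later folds see more training data
    cutoff = len(y) - h - (n_windows - 1 - i) * step_size
    train, test = y.iloc[:cutoff], y.iloc[cutoff:cutoff + h]
    folds.append((train, test))

for train, test in folds:
    print(len(train), len(test))  # 12 12, then 24 12, then 36 12
```

Keeping the training window at a fixed length instead (dropping the oldest observations as the origin advances) would turn this into the rolling, non-expanding variant.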
--- class: inverse, center, middle # Forecast Accuracy Metrics --- ## Model Performance Evaluation in the Nixtlaverse ``` python from utilsforecast.losses import * ``` ### `evaluate` ``` python evaluate (df:~AnyDFType, metrics:List[Callable], models:Optional[List[str]]=None, train_df:Optional[~AnyDFType]=None, level:Optional[List[int]]=None, id_col:str='unique_id', time_col:str='ds', target_col:str='y', agg_fn:Optional[str]=None) ``` .footnote[ <html> <hr> </html> **Source:** [Nixtla's UtilsForecast Evaluation Documentation](https://nixtlaverse.nixtla.io/utilsforecast/evaluation.html) ] --- ## Model Performance Evaluation in the Nixtlaverse (Cont.) .font80[ <table><thead><tr><th></th><th><strong>Type</strong></th><th><strong>Default</strong></th><th><strong>Details</strong></th></tr></thead><tbody><tr><td>df</td><td>AnyDFType</td><td></td><td>Forecasts to evaluate.<br/>Must have <code>id_col</code>, <code>time_col</code>, <code>target_col</code> and models’ predictions.</td></tr><tr><td>metrics</td><td>List</td><td></td><td>Functions with arguments <code>df</code>, <code>models</code>, <code>id_col</code>, <code>target_col</code> and optionally <code>train_df</code>.</td></tr><tr><td>models</td><td>Optional</td><td>None</td><td>Names of the models to evaluate.<br/>If <code>None</code> will use every column in the dataframe after removing id, time and target.</td></tr><tr><td>train_df</td><td>Optional</td><td>None</td><td>Training set. Used to evaluate metrics such as <a href="https://Nixtla.github.io/utilsforecast/losses.html#mase" target="_blank" rel="noreferrer"><code>mase</code></a>.</td></tr><tr><td>level</td><td>Optional</td><td>None</td><td>Prediction interval levels. 
Used to compute losses that rely on quantiles.</td></tr><tr><td>id_col</td><td>str</td><td>unique_id</td><td>Column that identifies each series.</td></tr><tr><td>time_col</td><td>str</td><td>ds</td><td>Column that identifies each timestep; its values can be timestamps or integers.</td></tr><tr><td>target_col</td><td>str</td><td>y</td><td>Column that contains the target.</td></tr><tr><td>agg_fn</td><td>Optional</td><td>None</td><td>Statistic to compute on the scores by id to reduce them to a single number.</td></tr><tr><td><strong>Returns</strong></td><td><strong>AnyDFType</strong></td><td></td><td><strong>Metrics with one row per (id, metric) combination and one column per model.<br/>If <code>agg_fn</code> is not <code>None</code>, there is only one row per metric.</strong></td></tr></tbody></table> ] .footnote[ <html> <hr> </html> **Source:** [Nixtla's UtilsForecast Evaluation Documentation](https://nixtlaverse.nixtla.io/utilsforecast/evaluation.html) ] --- ## Losses The most important training signal is the forecast error, which is the difference between the observed value `\(y_{\tau}\)` and the prediction `\(\hat{y}_{\tau}\)` at time `\(\tau\)`: `$$e_{\tau} = y_{\tau} - \hat{y}_{\tau} \quad \quad \tau \in \{t+1, \dots, t+H\}$$` The training loss summarizes the forecast errors in different evaluation metrics. .footnote[ <html> <hr> </html> **Source:** [Nixtla's UtilsForecast Losses Documentation](https://nixtlaverse.nixtla.io/utilsforecast/losses.html) ] --- ## Scale-Dependent Errors: `mae` **MAE** measures prediction accuracy by averaging the absolute deviations between forecasts and actual values. `$$\text{MAE}(y_{\tau}, \hat{y}_{\tau}) = \frac{1}{H} \sum_{\tau = t+1}^{t+H} |y_{\tau} - \hat{y}_{\tau}|$$` - **Interpretation**: Provides a straightforward measure of forecast accuracy; lower MAE indicates better performance. - **Characteristic:** Does not penalize larger errors more than smaller ones; treats all errors equally.
.footnote[ <html> <hr> </html> **Source:** [Nixtla's UtilsForecast Losses Documentation](https://nixtlaverse.nixtla.io/utilsforecast/losses.html) ] --- ## Scale-Dependent Errors: `rmse` **RMSE** is the square root of the average of the squared differences between forecasts and actual values. `$$\text{RMSE}(y_{\tau}, \hat{y}_{\tau}) = \sqrt{\frac{1}{H} \sum_{\tau = t+1}^{t+H} (y_{\tau} - \hat{y}_{\tau})^2}$$` - **Interpretation**: Emphasizes larger errors due to squaring; useful when large errors are particularly undesirable. - **Characteristic:** Penalizes large errors more than MAE (i.e., it is more sensitive to outliers than MAE). .footnote[ <html> <hr> </html> **Source:** [Nixtla's UtilsForecast Losses Documentation](https://nixtlaverse.nixtla.io/utilsforecast/losses.html) ] --- ## Percentage Errors: `mape` **MAPE** calculates the average absolute error as a proportion of the actual values (multiply by 100 to report a percentage). `$$\text{MAPE}(y_{\tau}, \hat{y}_{\tau}) = \frac{1}{H} \sum_{\tau = t+1}^{t+H} \left| \frac{y_{\tau} - \hat{y}_{\tau}}{y_{\tau}} \right|$$` - **Interpretation**: Expresses forecast accuracy in relative terms; lower MAPE indicates better performance. - **Characteristic:** Can be misleading if actual values are close to zero, leading to extremely high MAPE values.
.footnote[ <html> <hr> </html> **Source:** [Nixtla's UtilsForecast Losses Documentation](https://nixtlaverse.nixtla.io/utilsforecast/losses.html) ] --- class: inverse, center, middle # Recap --- ## Summary of Main Points By now, you should be able to do the following: - Install and import Nixtla's libraries ([StatsForecast](https://nixtlaverse.nixtla.io/statsforecast/index.html), [MLForecast](https://nixtlaverse.nixtla.io/mlforecast/index.html), [NeuralForecast](https://nixtlaverse.nixtla.io/neuralforecast/docs/getting-started/introduction.html), [UtilsForecast](https://nixtlaverse.nixtla.io/utilsforecast/index.html), and [TimeGPT](https://nixtlaverse.nixtla.io/nixtla/docs/getting-started/introduction.html)) for forecasting - Distinguish fixed window from rolling-origin - Introduce forecast accuracy metrics (MAE, MAPE, RMSE) --- ## 📝 Review and Clarification 📝 1. **Class Notes**: Take some time to revisit your class notes for key insights and concepts. 2. **Zoom Recording**: The recording of today's class will be made available on Canvas approximately 3-4 hours after the session ends. 3. **Questions**: Please don't hesitate to ask for clarification on any topics discussed in class. It's crucial not to let questions accumulate.
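---

## Appendix: Computing the Metrics by Hand

The three metrics map directly onto their formulas. Below is a numpy sketch with made-up numbers; in practice, `utilsforecast.losses` provides dataframe-based `mae`, `rmse`, and `mape` functions that do this per series.

``` python
import numpy as np

y = np.array([100.0, 110.0, 120.0, 130.0])      # actual values (made up)
y_hat = np.array([102.0, 108.0, 125.0, 126.0])  # forecasts (made up)

e = y - y_hat                    # forecast errors e_tau
mae = np.mean(np.abs(e))         # mean absolute error
rmse = np.sqrt(np.mean(e ** 2))  # root mean squared error
mape = np.mean(np.abs(e / y))    # proportion; multiply by 100 for a percentage

print(mae, rmse, round(mape, 4))  # 3.25 3.5 0.0277
```

Because squaring weights large errors more heavily, RMSE is always at least as large as MAE on the same set of errors.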